The Impact of Source–Side Syntactic Reordering on Hierarchical Phrase-based SMT

نویسندگان

  • Jinhua Du
  • Andy Way
چکیده

Syntactic reordering has been demonstrated to be helpful and effective for handling different word orders between source and target languages in SMT. However, in terms of hierarchial PB-SMT (HPB), does the syntactic reordering still has a significant impact on its performance? This paper introduces a reordering approach which explores the { (DE) grammatical structure in Chinese. We employ the Stanford DE classifier to recognise the DE structures in both training and test sentences of Chinese, and then perform word reordering to make the Chinese sentences better match the word order of English. The annotated and reordered training data and test data are applied to a re-implemented HPB system and the impact of the DE construction is examined. The experiments are conducted on the NIST 2008 evaluation data and experimental results show that the BLEU and METEOR scores are significantly improved by 1.83/8.91 and 1.17/2.73 absolute/relative points respectively.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A Discriminative Latent Variable-Based "DE" Classifier for Chinese-English SMT

Syntactic reordering on the source-side is an effective way of handling word order differences. The { (DE) construction is a flexible and ubiquitous syntactic structure in Chinese which is a major source of error in translation quality. In this paper, we propose a new classifier model — discriminative latent variable model (DPLVM) — to classify the DE construction to improve the accuracy of the...

متن کامل

Source-side Syntactic Reordering Patterns with Functional Words for Improved Phrase-based SMT

Inspired by previous source-side syntactic reordering methods for SMT, this paper focuses on using automatically learned syntactic reordering patterns with functional words which indicate structural reorderings between the source and target language. This approach takes advantage of phrase alignments and source-side parse trees for pattern extraction, and then filters out those patterns without...

متن کامل

Improved Phrase-based SMT with Syntactic Reordering Patterns Learned from Lattice Scoring

In this paper, we present a novel approach to incorporate source-side syntactic reordering patterns into phrase-based SMT. The main contribution of this work is to use the lattice scoring approach to exploit and utilize reordering information that is favoured by the baseline PBSMT system. By referring to the parse trees of the training corpus, we represent the observed reorderings with source-s...

متن کامل

Practical Approach to Syntax-based Statistical Machine Translation

This paper presents a practical approach to statistical machine translation (SMT) based on syntactic transfer. Conventionally, phrase-based SMT generates an output sentence by combining phrase (multiword sequence) translation and phrase reordering without syntax. On the other hand, SMT based on tree-to-tree mapping, which involves syntactic information, is theoretical, so its features remain un...

متن کامل

Linguistically Annotated BTG for Statistical Machine Translation

Bracketing Transduction Grammar (BTG) is a natural choice for effective integration of desired linguistic knowledge into statistical machine translation (SMT). In this paper, we propose a Linguistically Annotated BTG (LABTG) for SMT. It conveys linguistic knowledge of source-side syntax structures to BTG hierarchical structures through linguistic annotation. From the linguistically annotated da...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2010